Abstract

The classical Frobenius problem is to compute the largest number $g$ not
representable as a non-negative integer linear combination of non-negative integers
$x_1, x_2, \ldots, x_k$, where $\gcd(x_1, x_2, \ldots, x_k) = 1$. In this paper we consider
generalizations of the Frobenius problem to the noncommutative setting of a free monoid.
Unlike the commutative case, where the bound on $g$ is quadratic, we are able to show
exponential or subexponential behavior for an analogue of $g$, depending on the particular
measure chosen.

1 Introduction 

Let $x_1, x_2, \ldots, x_k$ be positive integers. It is well known that every sufficiently large
integer can be written as a non-negative integer linear combination of the $x_i$ if and only if
$\gcd(x_1, x_2, \ldots, x_k) = 1$.

The Frobenius problem (so-called because, according to Brauer [2], "Frobenius mentioned
it occasionally in his lectures") is the following:

Given positive integers $x_1, x_2, \ldots, x_k$ with $\gcd(x_1, x_2, \ldots, x_k) = 1$, find the largest
positive integer $g(x_1, x_2, \ldots, x_k)$ which cannot be represented as a non-negative integer linear
combination of the $x_i$.

Example 1. The Chicken McNuggets Problem ([29, pp. 19-20, 233-234], [22]). If Chicken
McNuggets can be purchased at McDonald's only in quantities of 6, 9, or 20 pieces, what is
the largest number of McNuggets that cannot be purchased? The answer is $g(6, 9, 20) = 43$.
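
For small inputs the value in Example 1 can be confirmed by brute force. The following Python sketch is only an illustration: the function name `frobenius_number` is ours, and the method is a plain dynamic program rather than any of the algorithms cited in this paper. It marks representable totals and stops once `min(xs)` consecutive representable values have been seen, since every larger value is then representable as well.

```python
from functools import reduce
from math import gcd


def frobenius_number(xs):
    """Largest integer not representable as a non-negative combination of xs.

    Assumes the entries of xs are positive integers with gcd 1.
    Returns -1 if every non-negative integer is representable (1 in xs).
    """
    if reduce(gcd, xs) != 1:
        raise ValueError("gcd of the inputs must be 1")
    smallest = min(xs)
    representable = [True]  # representable[n] is True iff n is a combination
    largest_gap = -1
    run = 1  # length of the current run of consecutive representable values
    n = 0
    while run < smallest:
        n += 1
        ok = any(n >= x and representable[n - x] for x in xs)
        representable.append(ok)
        if ok:
            run += 1
        else:
            run = 0
            largest_gap = n
    return largest_gap
```

Calling `frobenius_number([6, 9, 20])` reproduces the McNuggets answer.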
Although it seems simple at first glance, the Frobenius problem on positive integers has
many subtle and intriguing aspects that continue to elicit study. A recent book by Ramírez
Alfonsín [24] lists over 400 references on this problem. Applications to many different fields
exist: to algebra [19]; the theory of matrices [11]; counting points in polytopes [1]; the
problem of efficient sorting using Shellsort [11], [25], [30], [26]; the theory of Petri nets [28]; the
liveness of weighted circuits [8]; etc.

Generally speaking, research on the Frobenius problem can be classified into three dif- 
ferent areas: 

• Formulas or algorithms for the exact computation of $g(x_1, \ldots, x_k)$, including formulas
for $g$ where the $x_i$ obey certain relations, such as being in arithmetic progression;

• The computational complexity of the problem; 

• Good upper or lower bounds on $g(x_1, \ldots, x_k)$.

For $k = 2$, it is folklore that

$g(x_1, x_2) = x_1 x_2 - x_1 - x_2$;    (1)

this formula is often attributed to Sylvester [27], although he did not actually state it. Eq. (1)
gives an efficient algorithm to compute $g$ for two elements. For $k = 3$, efficient algorithms
have been given by Greenberg [14] and Davison [10]; if $x_1 < x_2 < x_3$, these algorithms run in
time bounded by a polynomial in $\log x_3$. Kannan [17], [18] gave a very complicated algorithm
that runs in polynomial time in $\log x_k$ if $k$ is fixed, but is wildly exponential in $k$. However,
Ramírez Alfonsín [23] proved that the general problem is NP-hard, under Turing reductions,
by reducing from the integer knapsack problem. So it seems very likely that there is no
simple formula for computing $g(x_1, x_2, \ldots, x_k)$ for arbitrary $k$. Nevertheless, recent work
by Einstein, Lichtblau, Strzebonski, and Wagon [12] shows that in practice the Frobenius
number can be computed relatively efficiently, even for very large numbers, at least for k < 8.

Another active area of interest is estimating how big $g$ is in terms of $x_1, x_2, \ldots, x_k$ for
$x_1 < x_2 < \cdots < x_k$. It is known, for example, that $g(x_1, x_2, \ldots, x_k) < x_k^2$. This follows from
Wilf's algorithm [31]. Many other bounds are known.

One can also study variations on the Frobenius problem. For example, given positive
integers $x_1, x_2, \ldots, x_k$ with $\gcd(x_1, x_2, \ldots, x_k) = 1$, what is the number $f(x_1, x_2, \ldots, x_k)$ of
positive integers not represented as a non-negative integer linear combination of the $x_i$?
Sylvester, in an 1884 paper [27], showed that $f(x_1, x_2) = \frac{1}{2}(x_1 - 1)(x_2 - 1)$.
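
Both two-element formulas, Eq. (1) for $g$ and Sylvester's count $f$, are easy to check by enumeration. The sketch below is illustrative only; the helper name `non_representable` is hypothetical, and the scan is limited to values below $x_1 x_2$, which suffices because $g(x_1, x_2) < x_1 x_2$.

```python
from math import gcd


def non_representable(x1, x2):
    """List the positive integers that are not of the form a*x1 + b*x2, a, b >= 0.

    Assumes gcd(x1, x2) == 1, so only values below x1*x2 need to be scanned.
    """
    if gcd(x1, x2) != 1:
        raise ValueError("inputs must be coprime")
    limit = x1 * x2
    reachable = {a * x1 + b * x2
                 for a in range(x2 + 1)
                 for b in range(x1 + 1)
                 if a * x1 + b * x2 < limit}
    return [n for n in range(1, limit) if n not in reachable]


gaps = non_representable(5, 7)
print(max(gaps), len(gaps))  # compare with x1*x2 - x1 - x2 and (x1-1)*(x2-1)/2
```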

Our goal in this paper is to generalize the Frobenius problem to the setting of a free
monoid. In this framework, we start with a finite, nonempty alphabet $\Sigma$, and consider the
set of all finite words $\Sigma^*$. Instead of considering integers $x_1, x_2, \ldots, x_k$, we consider words
$x_1, x_2, \ldots, x_k \in \Sigma^*$. Instead of considering linear combinations of integers, we consider
the languages $\{x_1, x_2, \ldots, x_k\}^*$ and $x_1^* x_2^* \cdots x_k^*$. Actually, we consider several additional
generalizations, which vary according to how we measure the size of the input, conditions on
the input, and measures of the size of the result. For an application of the noncommutative
Frobenius problem, see Clément, Duval, Guaiana, Perrin, and Rindone [9].

In order to motivate our definitions, we consider the easiest case first: where $\Sigma = \{0\}$, a
unary alphabet.

2 The unary case 

Suppose $x_i = 0^{a_i}$ for $1 \le i \le k$. The Frobenius problem is evidently linked to many
problems over unary languages. It figures, for example, in estimating the size of the smallest
DFA equivalent to a given NFA [7].

If $L \subseteq \Sigma^*$, by $\overline{L}$ we mean $\Sigma^* - L$, the complement of $L$. If $L$ is a finite language, by $|L|$
we mean the cardinality of $L$. Evidently we have

Proposition 2. Suppose $x_i = 0^{a_i}$ for $1 \le i \le k$, and write $S = \{x_1, x_2, \ldots, x_k\}$. Then $S^*$ is
co-finite if and only if $\gcd(a_1, a_2, \ldots, a_k) = 1$. Furthermore, if $S^*$ is co-finite, then the length
of the longest word in $\overline{S^*}$ is $g(a_1, a_2, \ldots, a_k)$, and $|\overline{S^*}| = f(a_1, a_2, \ldots, a_k)$.

This result suggests that one appropriate noncommutative generalization of the condition
$\gcd(a_1, a_2, \ldots, a_k) = 1$ is that $S^* = \{x_1, x_2, \ldots, x_k\}^*$ be co-finite, and one appropriate
generalization of the $g$ function is the length of the longest word not in $S^*$.

But there are other possible generalizations. Instead of measuring the length of the 
longest omitted word, we could instead consider the state complexity of S*. By the state 
complexity of a regular language L, written sc(L), we mean the number of states in the 
(unique) minimal deterministic finite automaton (DFA) accepting L. In the unary case, this 
alternate measure has a nice expression in terms of the ordinary Frobenius function: 

Theorem 3. Let $\gcd(a_1, a_2, \ldots, a_k) = 1$. Then

$\mathrm{sc}(\{0^{a_1}, 0^{a_2}, \ldots, 0^{a_k}\}^*) = g(a_1, a_2, \ldots, a_k) + 2$.

Proof. Since $\gcd(a_1, a_2, \ldots, a_k) = 1$, every word of length $> g(a_1, a_2, \ldots, a_k)$ will be in
the set $\{0^{a_1}, 0^{a_2}, \ldots, 0^{a_k}\}^*$. Thus we can accept $\{0^{a_1}, 0^{a_2}, \ldots, 0^{a_k}\}^*$ with a DFA having
$g(a_1, \ldots, a_k) + 2$ states, using a "tail" of $g(a_1, \ldots, a_k) + 1$ states and a "loop" of one
accepting state. Thus $\mathrm{sc}(\{0^{a_1}, 0^{a_2}, \ldots, 0^{a_k}\}^*) \le g(a_1, a_2, \ldots, a_k) + 2$.

To see $\mathrm{sc}(\{0^{a_1}, 0^{a_2}, \ldots, 0^{a_k}\}^*) \ge g(a_1, a_2, \ldots, a_k) + 2$, we show that the words

$\epsilon, 0, 0^2, \ldots, 0^{g(a_1, \ldots, a_k) + 1}$

are pairwise inequivalent under the Myhill-Nerode equivalence relation. Pick $0^i$ and $0^j$,
$0 \le i < j \le g(a_1, \ldots, a_k) + 1$. Let $L = \{0^{a_1}, 0^{a_2}, \ldots, 0^{a_k}\}^*$. Choose $z = 0^{g(a_1, \ldots, a_k) - i}$. Then
$0^i z = 0^{g(a_1, \ldots, a_k)} \notin L$, while $0^j z = 0^{g(a_1, \ldots, a_k) + j - i} \in L$, since $j > i$. □

Corollary 4. Let $\gcd(a_1, \ldots, a_k) = d$. Then

$\mathrm{sc}(\{0^{a_1}, 0^{a_2}, \ldots, 0^{a_k}\}^*) = d(g(a_1/d, a_2/d, \ldots, a_k/d) + 1) + 1$.

Hence it follows that $\mathrm{sc}(\{0^{a_1}, 0^{a_2}, \ldots, 0^{a_k}\}^*) = O(a_k^2)$. Furthermore, this bound is
essentially optimal; since $g(n, n+1) = n^2 - n - 1$, there exist examples with
$\mathrm{sc}(\{0^{a_1}, 0^{a_2}, \ldots, 0^{a_k}\}^*) = \Omega(a_k^2)$.
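
The tail-and-loop description used in Theorem 3 and Corollary 4 can be turned into a small experiment. The Python sketch below is a rough illustration under stated assumptions: the function name is ours, it uses the fact that the set of representable lengths is ultimately periodic with period $d$, and it relies on the quadratic bound on $g$ quoted in the introduction to choose a scan horizon.

```python
from functools import reduce
from math import gcd


def unary_star_state_complexity(lengths):
    """Number of states of the minimal DFA for {0^a : a in lengths}* over {0}.

    Sketch: the set of representable lengths is ultimately periodic with
    period d = gcd(lengths), so the minimal DFA is a tail followed by a loop
    of d states. The scan horizon below relies on the quadratic bound on g.
    """
    d = reduce(gcd, lengths)
    horizon = max(lengths) ** 2 // d + 2 * d
    member = [False] * (horizon + d + 1)
    member[0] = True
    for n in range(1, len(member)):
        member[n] = any(n >= a and member[n - a] for a in lengths)
    # Smallest tail length t with member[n] == member[n + d] for all n >= t.
    tail = 0
    for n in range(horizon + 1):
        if member[n] != member[n + d]:
            tail = n + 1
    return tail + d
```

For the lengths 6, 9, 20 the function returns $g(6, 9, 20) + 2$, and for the lengths 4, 6 (where $d = 2$ and $g(2, 3) = 1$) it returns $d(g + 1) + 1$.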
3 The case of larger alphabets

We now turn to the main results of the paper. Given as input a list of words $x_1, x_2, \ldots, x_k$,
not necessarily distinct, and defining $S = \{x_1, x_2, \ldots, x_k\}$, we can measure the size of the
input in a number of different ways:

(a) $k$, the number of words;

(b) $n = \max_{1 \le i \le k} |x_i|$, the length of the longest word;

(c) $m = \sum_{1 \le i \le k} |x_i|$, the total number of symbols;

(d) $\mathrm{sc}(\{x_1, x_2, \ldots, x_k\})$, the state complexity of the language represented by the input;

(e) $\mathrm{nsc}(\{x_1, x_2, \ldots, x_k\})$, the nondeterministic state complexity of the language represented
by the input.

We may impose various conditions on the input: 

(i) Each $x_i$ is defined over the unary alphabet;

(ii) $S^* = \{x_1, x_2, \ldots, x_k\}^*$ is co-finite;

(iii) $k = 2$;

(iv) $k$ is fixed.

And finally, we can explore various measures on the size of the result:

1. $\mathcal{L} = \max_{x \in \Sigma^* - S^*} |x|$, the length of the longest word not in $S^*$;

2. $\mathcal{K} = \max_{x \in \Sigma^* - x_1^* x_2^* \cdots x_k^*} |x|$, the length of the longest word not in $x_1^* x_2^* \cdots x_k^*$;

3. $\mathcal{S} = \mathrm{sc}(S^*)$, the state complexity of $S^*$;

4. $\mathcal{N} = \mathrm{nsc}(S^*)$, the nondeterministic state complexity of $S^*$;

5. $\mathcal{M} = |\Sigma^* - S^*|$, the number of words not in $S^*$;

6. $\mathcal{S}' = \mathrm{sc}(x_1^* x_2^* \cdots x_k^*)$;

7. $\mathcal{N}' = \mathrm{nsc}(x_1^* x_2^* \cdots x_k^*)$.

Clearly not every combination results in a sensible question to study. In order to study
$\mathcal{L}$, the length of the longest word omitted by $S^*$, we clearly need to impose condition (ii),
that $S^*$ be co-finite.

We now study under what conditions it makes sense to study $\mathcal{K} = \max_{x \in \Sigma^* - x_1^* x_2^* \cdots x_k^*} |x|$,
the length of the longest word not in $x_1^* x_2^* \cdots x_k^*$.
Theorem 5. Let $x_1, x_2, \ldots, x_k \in \Sigma^+$. Then $L = x_1^* x_2^* \cdots x_k^*$ is co-finite if and only if $|\Sigma| = 1$
and $\gcd(|x_1|, \ldots, |x_k|) = 1$.

Proof. If $|\Sigma| = 1$ and $\gcd(|x_1|, \ldots, |x_k|) = 1$, then a unary word of every sufficiently large
length can be obtained by concatenating the $x_i$, so $L$ is co-finite.

For the other direction, suppose $L$ is co-finite. If $|\Sigma| = 1$, let $\gcd(|x_1|, \ldots, |x_k|) = d$. If
$d > 1$, $L$ contains only words of length divisible by $d$, and so is not co-finite. So $d = 1$.

Hence assume $|\Sigma| \ge 2$, and let $a, b$ be distinct letters in $\Sigma$. Let $l = \max_{1 \le i \le k} |x_i|$, the
length of the longest word. Let $L' = ((a^{2l} b^{2l})^k)^+$. Then we claim that $L' \cap L = \emptyset$. For
if none of the $x_i$ consist of powers of a single letter, then the longest block of consecutive
identical letters in any word in $L$ is $< 2l$, so no word in $L'$ can be in $L$. Otherwise, say some
of the $x_i$ consist of powers of a single letter. Take any word $w$ in $L$, and count the number
$n(w)$ of maximal blocks of $2l$ or more consecutive identical letters in $w$. (Here "maximal"
means such a block is delimited on both sides by either the beginning or end of the word,
or a different letter.) Clearly $n(w) \le k$. But $n(w') \ge 2k$ for any word $w'$ in $L'$. Thus $L$ is not
co-finite, as it omits all the words in $L'$. □
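
The omitted words used in the proof of Theorem 5 can be checked mechanically for small inputs. The following Python sketch is only an illustration: the helper names are ours, membership in $x_1^* x_2^* \cdots x_k^*$ is tested with a regular expression built from the input words, and the two distinct letters are arbitrarily taken to be 0 and 1.

```python
import re


def in_star_product(words, w):
    """Decide whether w lies in x_1^* x_2^* ... x_k^* for the given list of words."""
    pattern = "".join("(?:%s)*" % re.escape(x) for x in words)
    return re.fullmatch(pattern, w) is not None


def witness_family(words, repetitions):
    """Return ((a^{2l} b^{2l})^k)^repetitions for l = max |x_i|, k = len(words).

    The letters a = '0' and b = '1' are an arbitrary choice for this sketch.
    """
    l = max(len(x) for x in words)
    k = len(words)
    return (("0" * (2 * l) + "1" * (2 * l)) * k) * repetitions


words = ["0", "01", "1", "10"]
print([in_star_product(words, witness_family(words, e)) for e in (1, 2, 3)])
```

Each witness is rejected, in line with the block-counting argument; the unary case with coprime lengths, by contrast, accepts every sufficiently long word.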



4 State complexity results 

In this section we study the measures $\mathcal{S} = \mathrm{sc}(S^*)$, $\mathcal{N} = \mathrm{nsc}(S^*)$, and $\mathcal{S}' = \mathrm{sc}(x_1^* x_2^* \cdots x_k^*)$.
First we review previous results.

Yu, Zhuang, and Salomaa [32] showed that if $L$ is accepted by a DFA with $n$ states, then
$L^*$ can be accepted by a DFA with at most $2^{n-1} + 2^{n-2}$ states. Furthermore, they showed
this bound is realized, in the sense that for all $n \ge 2$, there exists a DFA $M$ with $n$ states
such that the minimal DFA accepting $L(M)^*$ needs $2^{n-1} + 2^{n-2}$ states. This latter result
was given previously by Maslov [21].

Câmpeanu, Culik, Salomaa, and Yu [3] showed that if a DFA with $n$ states accepts
a finite language $L$, then $L^*$ can be accepted by a DFA with at most $2^{n-3} + 2^{n-4}$ states
for $n \ge 4$. Furthermore, this bound is actually achieved for $n > 4$ for an alphabet of size
3 or more. Unlike the examples we are concerned with in this section, however, the finite
languages they construct contain exponentially many words in $n$.

Holzer and Kutrib [15] examined the nondeterministic state complexity of Kleene star.
They showed that if an NFA $M$ with $n$ states accepts $L$, then $L^*$ can be accepted by an NFA
with $n + 1$ states, and this bound is tight. If $L$ is finite, then $n - 1$ states suffice, and this
bound is tight.

Campeanu and Ho 0] gave tight bounds for the number of states required to accept a 
finite language whose words are all bounded by length n. 

Proposition 6.

(a) $\mathrm{nsc}(\{x_1, x_2, \ldots, x_k\}^*) \le m - k + 1$.

(b) $\mathrm{sc}(\{x_1, x_2, \ldots, x_k\}^*) \le 2^{m-k+1}$.

(c) If no $x_i$ is a prefix of any other $x_j$, then $\mathrm{sc}(\{x_1, x_2, \ldots, x_k\}^*) \le m - k + 2$.

Proof. (a) Form an NFA from the trie for the words $x_1, \ldots, x_k$, sharing a common initial
state $q_0$, and having the transition on the last letter of each word go back to $q_0$. This
NFA will have at most $m - k + 1$ nodes.

(b) Take the NFA from part (a) and apply the subset construction.

(c) If no $x_i$ is a prefix of any other $x_j$, then the NFA constructed in part (a) is actually a
DFA. One extra state is needed as a "dead" state.

□ 
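
The constructions in this proof are short enough to sketch in code. The Python below is an illustrative sketch, not the authors' implementation: the function names are ours, words are assumed nonempty, the NFA is the trie with back edges from part (a), and the DFA size is obtained by the subset construction of part (b) followed by a standard Moore-style partition refinement. The dead state, when reachable, is counted.

```python
def star_nfa(words):
    """Trie-based NFA for {words}*: state 0 is initial and the only final state.

    Transitions on the last letter of each word lead back to state 0, as in
    the proof of part (a). Returns (number_of_states, transitions), where
    transitions maps (state, letter) to a set of states.
    """
    trie = {}          # (state, letter) -> child trie state
    transitions = {}   # (state, letter) -> set of NFA successor states
    n_states = 1
    for w in set(words):
        state = 0
        for pos, letter in enumerate(w):
            if pos == len(w) - 1:
                transitions.setdefault((state, letter), set()).add(0)
            else:
                if (state, letter) not in trie:
                    trie[(state, letter)] = n_states
                    n_states += 1
                child = trie[(state, letter)]
                transitions.setdefault((state, letter), set()).add(child)
                state = child
    return n_states, transitions


def star_state_complexity(words, alphabet):
    """States of the minimal complete DFA for {words}* over the given alphabet."""
    _, delta = star_nfa(words)
    start = frozenset([0])
    index = {start: 0}
    order = [start]
    table = []  # table[i][c] = index of the successor of subset i on alphabet[c]
    i = 0
    while i < len(order):
        row = []
        for letter in alphabet:
            target = frozenset(t for s in order[i] for t in delta.get((s, letter), ()))
            if target not in index:
                index[target] = len(order)
                order.append(target)
            row.append(index[target])
        table.append(row)
        i += 1
    # Moore-style refinement: split blocks until the partition is stable.
    block = [0 in subset for subset in order]
    while True:
        signature = [(block[i], tuple(block[j] for j in table[i]))
                     for i in range(len(order))]
        ids = {}
        new_block = [ids.setdefault(sig, len(ids)) for sig in signature]
        if len(ids) == len(set(block)):
            return len(ids)
        block = new_block
```

For example, `star_state_complexity(["01"], "01")` yields 3, matching the bound $m - k + 2$ of part (c).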

We now consider an example providing a lower bound for the state complexity of $\{x_1, x_2, \ldots, x_k\}^*$.
Let $t$ be an integer $\ge 2$, and define words as follows:

$y := 0 1^{t-1} 0$

$x_i := 1^{t-i-1} 0 1^{i+1}$, $0 \le i \le t - 2$.

Let $S_t := \{0, x_0, x_1, \ldots, x_{t-2}, y\}$.
Thus, for example,

$S_6 := \{0, 1111101, 1111011, 1110111, 1101111, 1011111, 0111110\}$.
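
The family $S_t$ is easy to generate programmatically, which is convenient for experiments. A minimal Python sketch (the function name `s_t` is ours):

```python
def s_t(t):
    """Return the list [0, x_0, ..., x_{t-2}, y] of words defining S_t, for t >= 2."""
    if t < 2:
        raise ValueError("t must be at least 2")
    xs = ["1" * (t - i - 1) + "0" + "1" * (i + 1) for i in range(t - 1)]
    y = "0" + "1" * (t - 1) + "0"
    return ["0"] + xs + [y]


print(s_t(6))
```

The call `s_t(6)` reproduces the seven words listed above.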

Theorem 7. $S_t^*$ has state complexity $3t 2^{t-2} + 2^{t-1}$.

The proof of this theorem is rather complicated, so we give a proof of the following
slightly weaker result:

Theorem 8. $\mathrm{sc}(S_t^*) \ge 2^{t-2}$.

Proof. First, we create an NFA $M_t$ with $3t - 1$ states that accepts $S_t^*$. This NFA has states

$Q = \{p_0, p_1, \ldots, p_t, q_1, q_2, \ldots, q_{t-1}, r_1, r_2, \ldots, r_{t-1}\}$

with only one final state, $F = \{p_0\}$.
For example, here is the NFA $M_6$.

Figure 1: The NFA $M_6$

We now determine $\delta(q, z)$ for each state $q$ of $M_t$ and each element $z \in S_t$. The reader
can verify that





= {r t -i,Po} 




= {po}, l<i<t-2 


5(pi,Xj) 


iq h if j = i — 1; 
1 0, otherwise. 


S(pt,y) 


= {n-i} 


S(qi,y) 


= {Po,Pi} 




= {ri-i}, 2<i<t-l 




= If forl<i<i 
1 0, otherwise. 




= 0, 1 < i < t - 1 



-1,0< j <t-2 



{{<li,Pj+i}, ifj = i-l; 
{r;}, if j < i — 1; for 1 < i < * - 1,0 < j < t- 2 

0, otherwise. 

From these relations, we deduce that 
$\delta(\{q_i, q_{i+1}, \ldots, q_{t-1}, p_t, p_0\}, y) = \{r_{i-1}, r_i, \ldots, r_{t-1}, p_0\}$

$\delta(\{r_{i+1}, r_{i+2}, \ldots, r_{t-1}, p_0\}, x_i) = \{q_{i+1}, p_{i+2}, r_{i+2}, r_{i+3}, \ldots, r_{t-1}, p_0\}$

$\delta(\{q_i, q_{i+1}, \ldots, q_j, p_{j+1}, r_{j+1}, r_{j+2}, \ldots, r_{t-1}, p_0\}, x_j) = \{q_i, q_{i+1}, \ldots, q_{j+1}, p_{j+2}, r_{j+2}, r_{j+3}, \ldots, r_{t-1}, p_0\}$, if $1 \le i \le j \le t - 3$

$\delta(\{q_i, q_{i+1}, \ldots, q_{t-2}, p_{t-1}, r_{t-1}, p_0\}, x_{t-2}) = \{q_i, q_{i+1}, \ldots, q_{t-1}, p_t, p_0\}$

Let $T$ be any subset of $\{r_1, r_2, \ldots, r_{t-2}\}$, and write $T = \{r_{i_1}, r_{i_2}, \ldots, r_{i_j}\}$ for $j$ indices

$1 \le i_1 < i_2 < \cdots < i_j \le t - 2$.

We claim that the $2^{t-2}$ words

$y \, x_{t-2} \, y \, x_{t-3} x_{t-2} \, y \, x_{t-4} x_{t-3} x_{t-2} \, y \cdots x_1 x_2 \cdots x_{t-2} \, y \, x_{i_1} x_{i_2} \cdots x_{i_j} \, y$

where

$1 \le i_1 < i_2 < \cdots < i_j \le t - 2$,

are pairwise inequivalent under the Myhill-Nerode equivalence relation.

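To show this, we first argue that any subset of states of the form $T' := \{p_0, r_{t-1}\} \cup T$,
where $T$ is as in the previous paragraph, is reachable from $p_0$. From the relations above we
see that the following path reaches $T'$:

$\{p_0\} \xrightarrow{y} \{p_0, r_{t-1}\} \xrightarrow{x_{t-2}} \{q_{t-1}, p_t, p_0\} \xrightarrow{y} \{r_{t-2}, r_{t-1}, p_0\}$
$\xrightarrow{x_{t-3}} \{q_{t-2}, p_{t-1}, r_{t-1}, p_0\} \xrightarrow{x_{t-2}} \{q_{t-2}, q_{t-1}, p_t, p_0\} \xrightarrow{y} \{r_{t-3}, r_{t-2}, r_{t-1}, p_0\}$
$\xrightarrow{x_{t-4}} \{q_{t-3}, p_{t-2}, r_{t-2}, r_{t-1}, p_0\} \xrightarrow{x_{t-3}} \{q_{t-3}, q_{t-2}, p_{t-1}, r_{t-1}, p_0\} \xrightarrow{x_{t-2}} \{q_{t-3}, q_{t-2}, q_{t-1}, p_t, p_0\}$
$\xrightarrow{y} \{r_{t-4}, r_{t-3}, r_{t-2}, r_{t-1}, p_0\} \cdots$
$\longrightarrow \{p_0, r_1, r_2, \ldots, r_{t-1}\}$
$\xrightarrow{x_{i_1} x_{i_2} \cdots x_{i_j} y} \{r_{i_1}, r_{i_2}, \ldots, r_{i_j}, r_{t-1}, p_0\}$.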

Finally, we argue that each of these subsets of states is inequivalent. This is because
given two distinct such subsets, say $T'$ and $T''$, there must be an $r_i$, $1 \le i \le t - 2$, that is
contained in one (say $T'$) but not the other. Then reading the word $1^{t-i}$ takes $T'$ to $p_0$, but
not $T''$. □

Corollary 9. There exists a family of sets $S_t$, each consisting of $t + 1$ words of length
$\le t + 1$, such that $\mathrm{sc}(S_t^*) = 2^{\Omega(t)}$. If $m$ is the total number of symbols in these words, then
$\mathrm{sc}(S_t^*) = 2^{\Omega(\sqrt{m})}$.
Using the ideas in the previous proof, we can also create an example achieving
subexponential state complexity for $x_1^* x_2^* \cdots x_k^*$.

Theorem 10. As before, define

$y := 0 1^{t-1} 0$

$x_i := 1^{t-i-1} 0 1^{i+1}$, $0 \le i \le t - 2$.

Let $L = (0^* x_0^* x_1^* \cdots x_{t-2}^* y^*)^e$ where $e = (t+1)(t-2)/2 + 2t$. Then $\mathrm{sc}(L) \ge 2^{t-2}$.

Proof. Define $A = \{x_0, x_1, \ldots, x_{t-2}, y, 0\}$ and $T = \{x_1, x_2, \ldots, x_{t-2}\}$. For any subset $S$ of $T$,
say $\{s_1, s_2, \ldots, s_j\}$ with $s_1 < s_2 < \cdots < s_j$, define

$x(S) = y \, x_{t-2} \, y \, x_{t-3} x_{t-2} \, y \cdots y \, x_1 x_2 \cdots x_{t-2} \, y \, x_{s_1} x_{s_2} \cdots x_{s_j} \, y$.

Note that $x(S)$ contains $t$ copies of $y$ and at most $(t-2)(t-1)/2 + t - 2 = (t+1)(t-2)/2$
$x$'s. Thus $|x(S)| \le (t+1)(t + (t+1)(t-2)/2)$ and $|x(S)|_0 \le 2t + (t+1)(t-2)/2$.

To get the bound $\mathrm{sc}(L) \ge 2^{t-2}$, we exhibit $2^{t-2}$ pairwise inequivalent words under the
Myhill-Nerode equivalence relation. Pick two distinct subsets of $T$, say $R$ and $S$. Since $R \ne S$,
there exists an element in one not contained in the other. Without loss of generality, let
$m \in R$, $m \notin S$. By the proof of Theorem 8 we have $x(R) 1^{t-m} \in A^*$ but $x(S) 1^{t-m} \notin A^*$.
Since $L \subseteq A^*$, $x(S) 1^{t-m} \notin L$. It remains to see $x(R) 1^{t-m} \in L$.

Since $x(R) 1^{t-m} \in A^*$, there exists a factorization of $x(R) 1^{t-m}$ in terms of elements of $A$.
However,

$|x(R) 1^{t-m}| \le |x(R)| + t \le (t+1)(t + (t+1)(t-2)/2 + t)$

so any factorization of $x(R) 1^{t-m}$ into elements of $A$ contains at most $(t+1)(t-2)/2 + 2t$
copies of words other than 0. Similarly

$|x(R) 1^{t-m}|_0 \le |x(R)|_0 \le (t+1)(t-2)/2 + 2t$

so any factorization of $x(R) 1^{t-m}$ into elements of $A$ contains at most $(t+1)(t-2)/2 + 2t$ copies
of the word 0. Thus a factorization of $x(R) 1^{t-m}$ into elements of $A$ is actually contained in
$L$. □

Corollary 11. There exists an infinite family of tuples $(x_1, x_2, \ldots, x_k)$ where $m$, the total
number of symbols, is $O(t^4)$, and $\mathrm{sc}(x_1^* \cdots x_k^*) = 2^{\Omega(m^{1/4})}$.

We now turn to an upper bound on the state complexity of S* in the case where the 
number of words in S is not specified, but we do have a bound on the length of the longest 
word. 
Theorem 12. Let $S = \{x_1, x_2, \ldots, x_k\}$ be a finite set with $\max_{1 \le i \le k} |x_i| = n$, that is, the
longest word is of length $n$. Then $\mathrm{sc}(S^*) \le \frac{2}{2|\Sigma| - 1}(2^n |\Sigma|^n - 1)$.

Proof. The idea is to create a DFA $M = (Q, \Sigma, \delta, q_0, F)$ that records the last $n - 1$ symbols
seen, together with the set of the possible positions inside those $n - 1$ symbols where the
factorization of the input into elements of $S$ could end.

Our set of states $Q$ is defined by $\{[w, T] : |w| < n, \ T \subseteq \{0, 1, \ldots, |w|\}\}$. The intent is
that the DFA reaches state $[x, T]$ on input $y = y_1 y_2 \cdots y_i$ if and only if $|x| = \min(n - 1, i)$,
$x$ is a suffix of $y$, and

$T = \{a : 0 \le a \le |x| \text{ and } y_1 y_2 \cdots y_{i-a} \in S^*\}$.

The initial state is $[\epsilon, \{0\}]$ and the set of final states is $\{[x, T] : 0 \in T\}$.
To maintain the invariant, we define our transition function $\delta$ as follows:
If $|x| < n - 1$, then $\delta([x, T], a) = [xa, U]$ where

\[ U = \begin{cases} (T + 1) \cup \{0\}, & \text{if a suffix of length } i + 1 \text{ of } xa \text{ is in } S \text{ for some } i \in T; \\ (T + 1), & \text{otherwise.} \end{cases} \]

If $|x| = n - 1$, then $\delta([bx, T], a) = [xa, U]$ where

\[ U = \begin{cases} ((T + 1) - \{n\}) \cup \{0\}, & \text{if a suffix of length } i + 1 \text{ of } bxa \text{ is in } S \text{ for some } i \in T; \\ (T + 1) - \{n\}, & \text{otherwise.} \end{cases} \]

Verification that the construction works is left to the reader. The number of states is

\[ \sum_{0 \le i < n} |\Sigma|^i 2^{i+1} = \frac{2}{2|\Sigma| - 1}\left(2^n |\Sigma|^n - 1\right). \] □
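
The closing identity is a geometric sum with ratio $2|\Sigma|$: there are $|\Sigma|^i$ words $w$ of length $i$ and $2^{i+1}$ subsets $T \subseteq \{0, 1, \ldots, i\}$. The short Python sketch below (function names ours) evaluates both sides so the bound can be tabulated for small parameters.

```python
def theorem12_state_count(n, sigma):
    """Sum over word lengths i < n of sigma^i * 2^(i+1): pairs [w, T] with |w| = i."""
    return sum(sigma ** i * 2 ** (i + 1) for i in range(n))


def theorem12_closed_form(n, sigma):
    """Closed form 2 * ((2*sigma)^n - 1) / (2*sigma - 1), an exact integer."""
    return 2 * ((2 * sigma) ** n - 1) // (2 * sigma - 1)


for n in range(1, 5):
    print(n, theorem12_state_count(n, 2), theorem12_closed_form(n, 2))
```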



5 State complexity for two words 

In this section we develop formulas bounding the state complexity of $\{w, x\}^*$ and $w^* x^*$.
Here, as usual, $g(x_1, x_2)$ denotes the Frobenius function introduced in Section 1.

We need the following lemma, which is of independent interest and which generalizes a
classical theorem of Fine and Wilf [13].

Lemma 13. Let $w$ and $x$ be nonempty words. Let $y \in w\{w, x\}^\omega$ and $z \in x\{w, x\}^\omega$. Then
the following conditions are equivalent:

(a) $y$ and $z$ agree on a prefix of length $|w| + |x| - \gcd(|w|, |x|)$;

(b) $wx = xw$;

(c) $y = z$.

Furthermore, the bound in (a) is optimal, in the sense that for all pairs of lengths $(m, n)$
there exists a pair of words $(w, x)$ with $(|w|, |x|) = (m, n)$ such that $w^\omega$ and $x^\omega$ agree on a
prefix of length $|w| + |x| - \gcd(|w|, |x|) - 1$.
Proof. (a) $\Rightarrow$ (b): We prove the contrapositive. Suppose $wx \ne xw$. Without loss of
generality, we can assume $\gcd(|w|, |x|) = 1$, for if not, we group the symbols of $w$ and $x$ into
blocks of size $d = \gcd(|w|, |x|)$, obtaining new words over a larger alphabet whose lengths
are relatively prime.

Then we prove that $y$ and $z$ differ at a position $\le |w| + |x| - 1$. The proof is by induction
on $|w| + |x|$.

The base case is $|w| + |x| = 2$. Then $|w| = |x| = 1$. Since $wx \ne xw$, we must have $w = a$,
$x = b$ with $a \ne b$. Then $y$ and $z$ differ at the 1st position.

Now assume true for $|w| + |x| < k$. We prove it for $|w| + |x| = k$. If $|w| = |x|$ then $y$
and $z$ must disagree at the $|w|$th position or earlier, for otherwise $w = x$ and $wx = xw$, and
$|w| \le |w| + |x| - 1$. So, without loss of generality, assume $|w| < |x|$. If $w$ is not a prefix of
$x$, then $y$ and $z$ disagree on the $|w|$th position or earlier, and again $|w| \le |w| + |x| - 1$.

So $w$ is a proper prefix of $x$. Write $x = wt$ for some nonempty word $t$. Now $wt \ne tw$, for
if so, then $wx = wwt = wtw = xw$. Then $y = ww \cdots$ and $z = wt \cdots$. By induction (since
$|w| + |t| < k$), $w^{-1} y$ and $w^{-1} z$ disagree at position $|w| + |t| - 1$ or earlier. Hence $y$ and $z$
disagree at position $2|w| + |t| - 1 = |w| + |x| - 1$ or earlier.

(b) $\Rightarrow$ (c): If $wx = xw$, then by the theorem of Lyndon-Schützenberger, both $w$ and
$x$ are powers of a common word $u$. Hence $y = u^\omega = z$.

(c) $\Rightarrow$ (a): Trivial.

For the optimality statement, the words constructed in the paper [B] suffice. □
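
Lemma 13 can be probed experimentally for the special case $y = w^\omega$, $z = x^\omega$. The Python sketch below is an illustration only (function names ours): it compares the two periodic words symbol by symbol and reports the first mismatch, treating agreement on $|w| + |x|$ symbols as equality, which is justified by condition (a).

```python
from math import gcd


def lemma13_bound(w, x):
    """The prefix length |w| + |x| - gcd(|w|, |x|) from condition (a)."""
    return len(w) + len(x) - gcd(len(w), len(x))


def common_prefix_of_powers(w, x):
    """Length of the longest common prefix of w^omega and x^omega.

    Only finitely many symbols are compared; if the two infinite words agree
    on |w| + |x| symbols they are treated as equal and None is returned.
    """
    bound = len(w) + len(x)
    for pos in range(bound):
        if w[pos % len(w)] != x[pos % len(x)]:
            return pos
    return None


print(common_prefix_of_powers("01", "010"), lemma13_bound("01", "010"))
```

For $w = 01$ and $x = 010$ the common prefix has length 3, one less than the bound in (a), so this pair also illustrates the optimality statement for lengths $(2, 3)$.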
Theorem 14. Let $w, x \in \Sigma^+$. Then

\[ \mathrm{sc}(\{w, x\}^*) \le \begin{cases} |w| + |x|, & \text{if } wx \ne xw; \\ d(g(|w|/d, |x|/d) + 1) + 2, & \text{if } wx = xw \text{ and } d = \gcd(|w|, |x|). \end{cases} \]

Furthermore, this bound is tight.

Proof. If $wx = xw$, then by a classical theorem of Lyndon and Schützenberger [20], we know
there exist a word $z$ and integers $i, j \ge 1$ such that $w = z^i$, $x = z^j$. Thus $\{w, x\}^* = \{z^i, z^j\}^*$.
Let $e = \gcd(i, j)$. Then $L = \{z^i, z^j\}^*$ consists of all words of the form $z^{ke}$ for $k > g(i/e, j/e)$,
together with some words of the form $z^{ke}$ for $0 \le k < g(i/e, j/e)$. Thus, as in the proof of
Corollary 4, we can accept $L$ with a "tail" of $e|z| g(i/e, j/e) + 1$ states and a "loop" of $e|z|$
states. Adding an additional state as a "dead state" to absorb unused transitions gives a
total of $e|z|(g(i/e, j/e) + 1) + 2$ states. Since $d = e|z|$, the bound follows.

Otherwise, $xw \ne wx$. Without loss of generality, let us assume that $|w| \le |x|$. Suppose
$w$ is not a prefix of $x$. Let $p$ be the longest common prefix of $w$ and $x$. Then we can write
$w = paw'$ and $x = pbx'$ for $a \ne b$. Then we can accept $\{w, x\}^*$ with a transition diagram
that has one chain of nodes labeled $p$ leading from $q_0$ to a state $q$, and two additional chains
leading from $q$ back to $q_0$, one labeled $aw'$ and one labeled $bx'$. Since $a \ne b$, this is a DFA. One
additional "dead state" might be required to absorb transitions on letters not mentioned.
The total number of states is $|p| + 1 + |w'| + |x'| + 1 \le |w| + |x|$.

Finally, suppose $|w| < |x|$ and $w$ is a prefix of $x$. We claim it suffices to bound the longest
common prefix between any word of $w\{w, x\}^*$ and $x\{w, x\}^*$. For if the longest common prefix
is of length $b$, we can distinguish between them after reading $b + 1$ symbols. The $(b+1)$th
symbol must be one of two possibilities, and we can use back arrows in the transition diagram
to the appropriate state. We may need one additional state as a "dead state", so the total
number of states needed is $b + 2$. But from Lemma 13 we know $b \le |w| + |x| - 2$. □

Theorem 15. Let $w, x \in \Sigma^+$. Then

\[ \mathrm{sc}(w^* x^*) \le \begin{cases} |w| + 2|x|, & \text{if } wx \ne xw; \\ d(g(|w|/d, |x|/d) + 1) + 2, & \text{if } wx = xw \text{ and } d = \gcd(|w|, |x|). \end{cases} \]

Proof. Similar to the proof of the previous theorem. Omitted. □



6 Longest word omitted 

In this section we assume that $S = \{x_1, x_2, \ldots, x_k\}$ for finite words $x_1, x_2, \ldots, x_k$, and $S^*$ is
co-finite. We first obtain an upper bound on the length of the longest word not in $S^*$.

Theorem 16. Suppose $|x_i| \le n$ for all $i$. Then if $S^*$ is co-finite, the length of the longest
word not in $S^*$ is $< \frac{2}{2|\Sigma| - 1}(2^n |\Sigma|^n - 1)$.

Proof. Given $S$, construct the DFA accepting $S^*$ by the construction of Theorem 12. The
resulting DFA has $q = \frac{2}{2|\Sigma| - 1}(2^n |\Sigma|^n - 1)$ states. Now change the "finality" of each state, so
a final state becomes non-final and vice versa. This new DFA accepts $\overline{S^*}$. Then the length of
the longest word accepted is the length of a longest path to a final state, which is at most $q - 1$. □

In the rest of this section we show that the length of the longest word not in S* can be 
exponentially long in n. We need several preliminary results first. 

We say that x is a proper prefix of a word y if y = xz for a nonempty word z. Similarly, 
we say x is a proper suffix of y if y = zx for a nonempty word z. 

Proposition 17. Let $S$ be a finite set of nonempty words such that $S^*$ is co-finite, and
$S^* \ne \Sigma^*$. Then for all $x \in S$, there exists $x' \in S$ such that $x$ is a proper prefix of $x'$, or vice
versa. Similarly, for all $x \in S$, there exists $x' \in S$ such that $x$ is a proper suffix of $x'$, or vice
versa.

Proof. Let $x \in S$. Since $S^* \ne \Sigma^*$, there exists $v \notin S^*$. Since $S^*$ is co-finite, $S^* \cap x^* v$ is
nonempty. Let $i \ge 0$ be the smallest integer such that $x^i v \in S^*$; then $i \ge 1$, for otherwise
$v \in S^*$. Since $x^i v \in S^*$, there exist $y_1, y_2, \ldots, y_j \in S$ such that $x^i v = y_1 y_2 \cdots y_j$. Now $y_1 \ne x$,
for otherwise by cancelling an $x$ from both sides, we would have $x^{i-1} v \in S^*$, contradicting
the minimality of $i$. If $|x| < |y_1|$, then $x$ is a proper prefix of $y_1$, while if $|x| > |y_1|$, then $y_1$
is a proper prefix of $x$.

A similar argument applies for the result about suffixes. □
Next, we give two lemmas that characterize those sets $S$ such that $S^*$ is co-finite, when
$S$ contains words of no more than two distinct lengths.

Lemma 18. Suppose $S \subseteq \Sigma^m \cup \Sigma^n$, $0 < m < n$, and $S^*$ is co-finite. Then $\Sigma^m \subseteq S$.

Proof. If $S^* = \Sigma^*$, then $S$ must contain every word $x$ of length $m$, for otherwise $S^*$ would
omit $x$. So assume $S^* \ne \Sigma^*$.

Let $x \in \Sigma^m$. Then $S^* \cap x\Sigma^*$ is nonempty, since $S^*$ is co-finite. Choose $v$ such that
$xv \in S^*$; then there is a factorization $xv = y_1 y_2 \cdots y_j$ where each $y_i \in S$. If $y_1 \in \Sigma^m$, then
$x = y_1$ and so $x \in S$. Otherwise $y_1 \in \Sigma^n$. By Proposition 17, there exists $z \in S$ such that
$y_1$ is a proper prefix of $z$ or vice versa. But since $S$ contains words of only lengths $m$ and $n$,
and $y_1 \in \Sigma^n$, we must have $z \in \Sigma^m$, and $z$ is a prefix of $y_1$. Then $x = z$, and so $x \in S$. □

Lemma 19. Suppose $S \subseteq \Sigma^m \cup \Sigma^n$, with $0 < m < n < 2m$ and $S^*$ is co-finite. Then
$\Sigma^l \subseteq S^*$, where $l = m|\Sigma|^{n-m} + n - m$.

Proof. Let $x$ be a word of length $l$ that is not in $S^*$. Then we can write $x$ uniquely as

$x = y_0 z_0 y_1 z_1 \cdots y_{|\Sigma|^{n-m} - 1} z_{|\Sigma|^{n-m} - 1} y_{|\Sigma|^{n-m}}$,    (2)

where $y_i \in \Sigma^{n-m}$ for $0 \le i \le |\Sigma|^{n-m}$, and $z_i \in \Sigma^{2m-n}$ for $0 \le i < |\Sigma|^{n-m}$.

Now suppose that $y_i z_i y_{i+1} \in S$ for some $i$ with $0 \le i < |\Sigma|^{n-m}$. Then we can write

\[ x = \Big( \prod_{0 \le j < i} y_j z_j \Big) \, y_i z_i y_{i+1} \, \Big( \prod_{i < k < |\Sigma|^{n-m}} z_k y_{k+1} \Big). \]

Note that $|y_j z_j| = |z_k y_{k+1}| = m$. From Lemma 18, each term in this factorization is in $S$.
Hence $x \in S^*$, a contradiction. It follows that

$y_i z_i y_{i+1} \notin S$ for all $i$ with $0 \le i < |\Sigma|^{n-m}$.    (3)

Now the factorization of $x$ in Eq. (2) uses $|\Sigma|^{n-m} + 1$ $y$'s, and there are only $|\Sigma|^{n-m}$
distinct words of length $n - m$. So, by the pigeonhole principle, we have $y_p = y_q$ for some
$0 \le p < q \le |\Sigma|^{n-m}$. Now define

$u = y_0 z_0 \cdots y_{p-1} z_{p-1}$
$v = y_p z_p \cdots y_{q-1} z_{q-1}$
$w = y_q z_q \cdots y_{|\Sigma|^{n-m}}$,

so $x = uvw$. Since $S^*$ is co-finite, there exists a smallest exponent $k \ge 0$ such that $u v^k w \in S^*$.

Now let $u v^k w = x_1 x_2 \cdots x_j$ be a factorization into elements of $S$. Then $x_1$ is a word of
length $m$ or $n$. If $|x_1| = n$, then comparing lengths gives $x_1 = y_0 z_0 y_1$. But by (3) we know
$y_0 z_0 y_1 \notin S$. So $|x_1| = m$, and comparing lengths gives $x_1 = y_0 z_0$. By similar reasoning we see
that $x_2 = y_1 z_1$, and so on. Hence $x_j = y_{|\Sigma|^{n-m} - 1} z_{|\Sigma|^{n-m} - 1} y_{|\Sigma|^{n-m}} \in S$. But this contradicts (3).

Thus, our assumption that $x \notin S^*$ must be false, and so $x \in S^*$. Since $x$ was arbitrary,
this proves the result. □
Now we can prove an upper bound on the length of omitted words, in the case where $S$
contains words of at most two distinct lengths.

Theorem 20. Suppose $S \subseteq \Sigma^m \cup \Sigma^n$, where $0 < m < n < 2m$, and $S^*$ is co-finite. Then
$S^* \ne \Sigma^*$, and the length of the longest word not in $S^*$ is $\le g(m, l) = ml - m - l$, where
$l = m|\Sigma|^{n-m} + n - m$.

Proof. Any word in $S^*$ must be a concatenation of words of length $m$ and $n$. If $\gcd(m, n) =
d > 1$, then $S^*$ omits all words whose length is not congruent to 0 (mod $d$), so $S^*$ is not
co-finite, contrary to the hypothesis. Thus $\gcd(m, n) = 1$. Then $S^*$ omits all words of length
$g(m, n)$, so $S^* \ne \Sigma^*$.

By Lemmas 18 and 19 we have $\Sigma^m \cup \Sigma^l \subseteq S^*$, where $l = m|\Sigma|^{n-m} + n - m$. Hence
$S^*$ contains all words of length $m$ and $l$; since $\gcd(m, l) = 1$, $S^*$ contains all words of length
$> g(m, l)$. □
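
To get a feeling for how quickly the bound of Theorem 20 grows, it can simply be evaluated. The helper below is a small Python sketch with a hypothetical name; `sigma` stands for the alphabet size $|\Sigma|$.

```python
def longest_omitted_bound(m, n, sigma):
    """Upper bound g(m, l) = m*l - m - l of Theorem 20, with l = m*sigma^(n-m) + n - m."""
    if not 0 < m < n < 2 * m:
        raise ValueError("requires 0 < m < n < 2m")
    l = m * sigma ** (n - m) + n - m
    return m * l - m - l


print(longest_omitted_bound(3, 5, 2))
```

For $m = 3$, $n = 5$ over a binary alphabet we get $l = 14$ and the bound $g(3, 14) = 25$; the factor $|\Sigma|^{n-m}$ in $l$ is what makes the bound exponential in $n$ when $n - m$ grows.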

Remark. We can actually improve the result of the previous theorem to arbitrary m and n, 
thus giving an upper bound in the case where S consists of words of exactly two distinct 
lengths. Details will appear in a later version of the paper. 

Corollary 21. Suppose $S \subseteq \Sigma^m \cup \Sigma^n$, where $0 < m < n < 2m$ and $\gcd(m, n) = 1$. Then
$S^*$ is co-finite iff $\Sigma^m \subseteq S$ and $\Sigma^l \subseteq S^*$, where $l = m|\Sigma|^{n-m} + n - m$.

Proof. If $S^*$ is co-finite, then by Lemmas 18 and 19 we get $\Sigma^m \subseteq S$ and $\Sigma^l \subseteq S^*$. On the
other hand, if $\Sigma^m \subseteq S$ and $\Sigma^l \subseteq S^*$, then since $\gcd(m, l) = 1$, every word of length $> g(m, l)$
is contained in $S^*$, so $S^*$ is co-finite. □

We need one more technical lemma. 

Lemma 22. Suppose $S \subseteq \Sigma^m \cup \Sigma^n$, where $0 < m < n < 2m$, and $S^*$ is co-finite. Let $\tau$
be a word not in $S^*$ where $|\tau| = n + jm$ for some $j \ge 0$. Then $S^* \cap (\tau \Sigma^m)^{i-1} \tau = \emptyset$ for
$1 \le i < m$.

Proof. As before, since $S^*$ is co-finite we must have $\gcd(m, n) = 1$. Define $L_i = (\tau \Sigma^m)^{i-1} \tau$
for $1 \le i < m$. We prove that $S^* \cap L_i = \emptyset$ by induction on $i$.

The base case is $i = 1$. Then $L_i = L_1 = \{\tau\}$. But $S^* \cap \{\tau\} = \emptyset$ by the hypothesis that
$\tau \notin S^*$.

Now suppose we have proved the result for some $i$, $i \le m - 2$, and we want to prove it
for $i + 1$. First we show that $S^* \cap \Sigma^{n-m} L_i = \emptyset$. Assume that $uw \in S^*$ for some $u \in \Sigma^{n-m}$
and $w \in L_i$. Then there is a factorization

$uw = y_1 y_2 \cdots y_t$    (4)

where $y_h \in S$ for $1 \le h \le t$. Now $|uw| = n - m + (n + jm + m)(i - 1) + n + jm =
n(i + 1) + m(ji + i - 2)$. Since $0 < i + 1 < m$, $m$ does not divide $|uw|$. Thus at least one of
the $y_h$ is of length $n$, for otherwise (4) could not be a factorization of $uw$ into elements of $S$.
Let $r$ be the smallest index such that $|y_r| = n$. Then we have

\[ uw = \underbrace{y_1 \cdots y_{r-1}}_{\text{all of length } m} \ \underbrace{y_r}_{\text{of length } n} \ y_{r+1} \cdots y_t. \]

Hence $|y_1 y_2 \cdots y_r| = m(r - 1) + n = mr + n - m$. Since, by Lemma 18, we have $\Sigma^m \subseteq S$, we
can write $y_1 \cdots y_r = u z_1 \cdots z_r$, where $z_h \in S$ for $1 \le h \le r$. Thus

$uw = y_1 \cdots y_{r-1} y_r y_{r+1} \cdots y_t$
$\ \ \ \ = u z_1 \cdots z_r y_{r+1} \cdots y_t$;

and, cancelling the $u$ on both sides, we get $w = z_1 \cdots z_r y_{r+1} \cdots y_t$. But each term on the
right is in $S$, so $w \in S^*$. But this contradicts our inductive hypothesis that $S^* \cap L_i = \emptyset$.
So now we know that

$S^* \cap \Sigma^{n-m} L_i = \emptyset$;    (5)

we'll use this fact below.

Now assume that $S^* \cap L_{i+1} \ne \emptyset$. Since $L_{i+1} = \tau \Sigma^m L_i$, there exist $a \in \Sigma^m$ and $w \in L_i$
such that $\tau a w \in S^*$. Write $\tau a w = g_1 g_2 \cdots g_p$, where $g_h \in S$ for $1 \le h \le p$. We claim that
$g_h \in \Sigma^m$ for $1 \le h < j + 1$. For if not, let $k$ be the smallest index such that $|g_k| = n$. Then
by comparing lengths, we have

\[ \tau = \underbrace{g_1 g_2 \cdots g_{k-1}}_{\text{each of length } m} \cdot \underbrace{g_k}_{\text{of length } n} \cdot \underbrace{g'_1 g'_2 \cdots g'_{j-k+1}}_{\text{each of length } m} \]

for some $g'_1, g'_2, \ldots, g'_{j-k+1} \in \Sigma^m$. But this shows $\tau \in S^*$, a contradiction. We also have
$g_{j+1} \notin \Sigma^n$, for otherwise $\tau = g_1 \cdots g_j g_{j+1} \in S^*$, a contradiction.

Now either $g_{j+2} \in \Sigma^m$ or $g_{j+2} \in \Sigma^n$. In the former case, by comparing lengths, we see
that $g_{j+3} \cdots g_p \in \Sigma^{n-m} L_i$. But this contradicts (5). In the latter case, by comparing lengths,
we see $g_{j+3} \cdots g_p \in L_i$, contradicting our inductive hypothesis. Thus our assumption that
$S^* \cap L_{i+1} \ne \emptyset$ was wrong, and the lemma is proved. □

Now we are ready to give a class of examples achieving the bound in Theorem 20. We
define $r(n, k, l)$ to be the word of length $l$ representing $n$ in base $k$, possibly with leading
zeros. For example, $r(11, 2, 5) = 01011$. For integers $0 < m < n$, we define

$$T(m,n) = \{\, r(i, |\Sigma|, n-m)\, 0^{2m-n}\, r(i+1, |\Sigma|, n-m) \;:\; 0 \le i \le |\Sigma|^{n-m} - 2 \,\}.$$

For example, over a binary alphabet we have $T(3, 5) = \{00001, 01010, 10011\}$.
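The definitions of $r(n,k,l)$ and $T(m,n)$ can be checked mechanically. The following is a minimal Python sketch, not taken from the paper: the function names `r` and `T` mirror the notation above, the parameter `sigma` (the alphabet size $|\Sigma|$) is an illustrative name, and the alphabet is assumed to consist of the digits `0` to `sigma - 1` with `sigma` at most 10.

```python
def r(value, base, length):
    """Word of the given length representing `value` in `base`, with leading zeros.

    Assumes base <= 10 so that every digit is a single character.
    """
    digits = []
    for _ in range(length):
        value, d = divmod(value, base)
        digits.append(str(d))
    if value:
        raise ValueError("value does not fit in the requested length")
    return "".join(reversed(digits))


def T(m, n, sigma=2):
    """The set T(m, n) over the alphabet {0, ..., sigma - 1}, for 0 < m < n < 2m."""
    gap = "0" * (2 * m - n)
    # i ranges over 0 <= i <= sigma^(n-m) - 2
    return {
        r(i, sigma, n - m) + gap + r(i + 1, sigma, n - m)
        for i in range(sigma ** (n - m) - 1)
    }


print(r(11, 2, 5))
print(sorted(T(3, 5)))
```

Every word of $T(m,n)$ has length $2(n-m) + (2m-n) = n$, so removing $T(m,n)$ from $\Sigma^m \cup \Sigma^n$ only removes words of length $n$.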

Theorem 23. Let $m, n$ be integers with $0 < m < n < 2m$ and $\gcd(m, n) = 1$, and let
$S = \Sigma^m \cup \Sigma^n - T(m, n)$. Then $S^*$ is co-finite and the longest words not in $S^*$ are of length
$g(m, l)$, where $l = m|\Sigma|^{n-m} + n - m$.

Proof. First, let's prove that $S^*$ is co-finite. Since $\Sigma^m \subseteq S$, by Corollary 21 it suffices to
show that $\Sigma^l \subseteq S^*$, where $l = m|\Sigma|^{n-m} + n - m$.
Let $x \in \Sigma^l$ and write

$$x = y_0 z_0 y_1 z_1 \cdots y_{|\Sigma|^{n-m}-1}\, z_{|\Sigma|^{n-m}-1}\, y_{|\Sigma|^{n-m}}$$

where $y_i \in \Sigma^{n-m}$ for $0 \le i \le |\Sigma|^{n-m}$, and $z_i \in \Sigma^{2m-n}$ for $0 \le i < |\Sigma|^{n-m}$.

If $y_i z_i y_{i+1} \in T(m, n)$ for all $i$, $0 \le i < |\Sigma|^{n-m}$, then since the base-$k$ expansions are forced
to match up, we have $y_i = r(i, |\Sigma|, n - m)$ for $0 \le i \le |\Sigma|^{n-m}$. But the longest such word is
of length $m|\Sigma|^{n-m} + n - 2m < l$, a contradiction. Hence $y_i z_i y_{i+1} \in S$ for some $i$. Thus

$$x = \Bigl(\prod_{0 \le j < i} y_j z_j\Bigr)\, y_i z_i y_{i+1}\, \Bigl(\prod_{i+1 < k \le |\Sigma|^{n-m}} z_{k-1} y_k\Bigr).$$

Note that $|y_j z_j| = |z_{k-1} y_k| = m$. Since $\Sigma^m \subseteq S$, this gives a factorization of $x$ into elements of $S$, so $x \in S^*$. Since $x$
was arbitrary, we have $\Sigma^l \subseteq S^*$.

Now we will prove that $\tau \notin S^*$, where

$$\tau := r(0, |\Sigma|, n-m)\, 0^{2m-n}\, r(1, |\Sigma|, n-m)\, 0^{2m-n} \cdots r(|\Sigma|^{n-m} - 1, |\Sigma|, n-m).$$

Note that $|\tau| = |\Sigma|^{n-m}(n - m) + (|\Sigma|^{n-m} - 1)(2m - n) = m|\Sigma|^{n-m} + n - 2m = l - m$.
Suppose there exists a factorization $\tau = w_1 w_2 \cdots w_t$, where $w_i \in S$ for $1 \le i \le t$. Since
$|\tau|$ is not divisible by $m$, at least one of these terms is of length $n$. Let $k$ be the smallest
index such that $w_k \in \Sigma^n$. Then $\tau = w_1 \cdots w_{k-1} w_k w_{k+1} \cdots w_t$. By comparing lengths, we get
$w_i = r(i - 1, |\Sigma|, n - m)\, 0^{2m-n}$ for $1 \le i < k$. Thus $w_k = r(k - 1, |\Sigma|, n - m)\, 0^{2m-n}\, r(k, |\Sigma|, n - m) \in S \cap \Sigma^n$.
But $r(k - 1, |\Sigma|, n - m)\, 0^{2m-n}\, r(k, |\Sigma|, n - m) \in T(m,n)$, a contradiction.
Thus $\tau \notin S^*$.

We may now apply Lemma 22 to get that $S^*$ omits words of the form $(\tau\Sigma^m)^{m-2}\tau$; these
words are of length $(l - m + m)(m - 2) + l - m = lm - l - m = g(m, l)$. This completes the
proof. □

Corollary 24. For each odd integer $n \ge 5$, there exists a set $S$ of binary words of length at
most $n$, such that $S^*$ is co-finite and the longest word not in $S^*$ is of length $\Omega(n^2 2^{n/2})$.

Proof. Choose $m = (n + 1)/2$ and apply Theorem 23. □
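The growth rate claimed in Corollary 24 can be made explicit. The following worked computation is a sketch that substitutes $|\Sigma| = 2$ and $m = (n+1)/2$ into the formulas of Theorem 23; it is not part of the original proof. Note that $0 < m < n < 2m = n + 1$ and that $\gcd(m, n)$ divides $\gcd(n + 1, n) = 1$, so the hypotheses of the theorem hold.

```latex
\begin{align*}
n - m &= \frac{n-1}{2}, \qquad
l = m\,2^{\,n-m} + n - m = \frac{n+1}{2}\, 2^{(n-1)/2} + \frac{n-1}{2}, \\
g(m, l) &= lm - l - m = l\,(m - 1) - m \\
        &= \left( \frac{n+1}{2}\, 2^{(n-1)/2} + \frac{n-1}{2} \right) \frac{n-1}{2} - \frac{n+1}{2} \\
        &= \frac{n^2 - 1}{4}\, 2^{(n-1)/2} + \frac{(n-1)^2}{4} - \frac{n+1}{2}
         \;=\; \Omega\!\left(n^2\, 2^{n/2}\right).
\end{align*}
```

For $n = 5$ this gives $6 \cdot 4 + 4 - 3 = 25$.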

Example 25. Let $m = 3$, $n = 5$, $\Sigma = \{0, 1\}$. Then $S = \Sigma^3 + \Sigma^5 - \{00001, 01010, 10011\}$.
Then a longest word not in $S^*$ is 00001010011 000 00001010011, of length 25.
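Parts of Example 25 can be verified by brute force. The sketch below is an illustration rather than the authors' method: it uses a standard dynamic-programming membership test for $S^*$ (the helper name `in_star` is hypothetical) and checks that $\tau = 00001010011$ and every word $\tau\alpha\tau$ with $\alpha \in \Sigma^3$ are omitted, while every word of length $l = 14$ lies in $S^*$. It does not check all words longer than 25.

```python
from itertools import product

SIGMA = "01"
T_3_5 = {"00001", "01010", "10011"}
# S = Sigma^3 + Sigma^5 - T(3, 5)
S = {"".join(p) for k in (3, 5) for p in product(SIGMA, repeat=k)} - T_3_5


def in_star(word, words):
    """Return True if `word` is a concatenation of zero or more elements of `words`."""
    lengths = {len(s) for s in words}
    ok = [False] * (len(word) + 1)
    ok[0] = True  # the empty word is always in S*
    for end in range(1, len(word) + 1):
        ok[end] = any(
            end >= k and ok[end - k] and word[end - k:end] in words
            for k in lengths
        )
    return ok[len(word)]


tau = "00001010011"
candidates = [tau + "".join(a) + tau for a in product(SIGMA, repeat=3)]
l = 3 * len(SIGMA) ** (5 - 3) + 5 - 3  # l = m|Sigma|^(n-m) + n - m = 14

print(in_star(tau, S))
print([in_star(w, S) for w in candidates])
print(all(in_star("".join(p), S) for p in product(SIGMA, repeat=l)))
```

The word given in the example is the candidate with $\alpha = 000$; its length is $11 + 3 + 11 = 25 = g(3, 14)$.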



7 Number of omitted words 

Recall that $f(x_1, x_2, \ldots, x_k)$ is the classical function which, for positive integers $x_1, \ldots, x_k$
with $\gcd(x_1, \ldots, x_k) = 1$, counts the number of integers not representable as a non-negative
integer linear combination of the $x_i$. In this section we consider a generalization of this
function to the setting of a free monoid, replacing the integers $x_i$ with finite words in $\Sigma^*$,
and replacing the condition $\gcd(x_1, \ldots, x_k) = 1$ with the requirement that $\{x_1, \ldots, x_k\}^*$ be
co-finite.
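For comparison with the free-monoid setting, the classical quantities $g$ and $f$ are easy to compute for small inputs. The following Python sketch is illustrative only: the function name `non_representable` is hypothetical, and it relies on the standard observation that once $\min(x_i)$ consecutive integers are representable, all larger integers are too.

```python
from functools import reduce
from math import gcd


def non_representable(gens):
    """List the positive integers that are not non-negative integer combinations of `gens`."""
    if reduce(gcd, gens) != 1:
        raise ValueError("the generators must have gcd 1")
    smallest = min(gens)
    reachable = [True]  # reachable[v] is True when v is representable; 0 always is
    gaps = []
    run = 0  # length of the current run of consecutive representable integers
    value = 0
    while run < smallest:
        value += 1
        ok = any(value >= x and reachable[value - x] for x in gens)
        reachable.append(ok)
        if ok:
            run += 1
        else:
            run = 0
            gaps.append(value)
    return gaps


gaps = non_representable([6, 9, 20])
print(max(gaps), len(gaps))  # g(6, 9, 20) and f(6, 9, 20)
```

Here `max(gaps)` is the Frobenius number $g$ and `len(gaps)` is the count $f$; for the Chicken McNuggets instance from the introduction the first value is 43.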

We have already studied this in the case of a unary alphabet in Section 2, so let us assume
that $\Sigma$ has at least two letters.

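Theorem 26. Let $x_1, x_2, \ldots, x_k \in \Sigma^*$ be such that $|x_i| \le n$ for $1 \le i \le k$. Let $S =
\{x_1, x_2, \ldots, x_k\}$ and suppose $S^*$ is co-finite. Then

$$\mathcal{M} = |\Sigma^* - S^*| \le \frac{|\Sigma|^q - 1}{|\Sigma| - 1},$$

where $q = \frac{2}{2|\Sigma| - 1}\left(2^n |\Sigma|^n - 1\right)$.

Proof. From Theorem 16, we know that if $S^*$ is co-finite, the length of the longest omitted
word is $< q$, where $q = \frac{2}{2|\Sigma| - 1}\left(2^n |\Sigma|^n - 1\right)$. The total number of words of length $< q$ is
$1 + |\Sigma| + \cdots + |\Sigma|^{q-1} = \frac{|\Sigma|^q - 1}{|\Sigma| - 1}$. □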

We now give an example achieving a doubly-exponential lower bound on $\mathcal{M}$.

Theorem 27. Let $m, n$ be integers with $0 < m < n < 2m$ and $\gcd(m, n) = 1$, and let
$S = \Sigma^m \cup \Sigma^n - T(m,n)$, where $T$ was introduced in the previous section. Then $S^*$ is
co-finite and $S^*$ omits at least $2^{|\Sigma|^{n-m}} - |\Sigma|^{n-m} - 1$ words.

Proof. Similar to that of Theorem 23. □



8 Conclusion 

We have generalized the classical Frobenius problem on integers to the noncommutative
setting of a free monoid. Many problems remain, including improving the upper and lower
bounds presented here, and examining the computational complexity of the associated
decision problems. We will examine these problems in a future paper.